Apparatus and method for detecting foreground in image

ABSTRACT

An apparatus and method for detecting a foreground in an image is provided, and the foreground detecting apparatus includes a context information estimator configured to estimate context information on a scene from an image frame of the image, a background model constructor configured to construct a background model of the image frame using the estimated context information, and a foreground detector configured to detect a foreground from the image frame based on the constructed background model.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2016-0008335, filed on Jan. 22, 2016, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.

BACKGROUND

1. Field

The following description relates to an image processing technology, and particularly to a technology for detecting a foreground in an image input from a moving camera.

2. Description of Related Art

A moving object in an image may be detected using different methods depending on the camera. In the case of a static camera, background subtraction is used. Background subtraction assumes that similar colors appear at the same location in a background region over time; a background is learned, and the learned background is subtracted from an input image to find a moving object, i.e., a foreground region. In the case of a moving camera, the background also has motion due to camera movement. Two approaches have therefore been proposed: performing background subtraction after generating a large panorama image using image stitching, and performing background subtraction after compensating, in a learned background model, for background motion occurring due to camera movement. However, a panorama-image based method has difficulty determining a stitching location and may be limited by the amount of memory required. In addition, a compensation-based method has difficulty in estimating background motion, which leads to difficulty in detecting a foreground when a background speed is significantly high.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

In one general aspect, there is provided an apparatus for detecting a foreground in an image, the apparatus including a processor configured to estimate context information on a scene from an image frame of the image, construct a background model of the image frame using the estimated context information, and detect a foreground from the image frame based on the constructed background model.

The processor may include a context information estimator configured to estimate the context information on the scene from the image frame of the image, a background model constructor configured to construct the background model of the image frame using the estimated context information, and a foreground detector configured to detect the foreground from the image frame based on the constructed background model.

The context information may include any one or any combination of background motion, a foreground speed, and an illumination change.

The context information estimator may include a background motion estimator configured to calculate a speed using an optical flow of the image frame and a previous image frame, and to generate a projective transform matrix that represents the background motion using the speed, a foreground speed estimator configured to estimate a foreground speed by correcting the speed based on the generated projective transform matrix, and an illumination change estimator configured to estimate the illumination change based on a difference between a mean of brightness intensities of the image frame and a mean of brightness intensities of the background model.

The background model constructor may include a background model generator configured to generate the background model of the image frame using a background model of a previous image frame and the background motion.

The background model generator may be configured to determine locations on the previous image frame based on a location of the previous image frame which corresponds to a first location of the image frame, and to generate a background model of the first location by performing weighted summation on background models of the determined locations.

The background model generator may be configured to correct a variance of the generated background model, in response to a value of a context variable based on the foreground speed being equal to or larger than a block size.

The background model generator may be configured to correct a variance of the background model of the first location by reflecting differences between a mean of the background model of the first location and means of the background models at the locations.

The background model constructor may include a background model updater configured to update the generated background model based on the illumination change.

The background model updater may be configured to update a mean of the generated background model by reflecting the illumination change, in response to a value of a context variable being smaller than a block size, and to update the mean and a variance of the generated background model by reflecting the illumination change and additional information, in response to the value of the context variable being equal to or larger than the block size.

The additional information may include any one or any combination of a mean of block-specific brightness intensities and information on a difference between a mean of the generated background model and the image frame.

The background model updater may calculate a value of a time-variant variable in units of blocks starting from an initial image frame input from the camera to the image frame, and adjust an update intensity of the generated background model using the value of the time-variant variable.

The foreground detector may include a foreground probability map generator configured to generate a foreground probability map by calculating a foreground probability for each pixel of the image frame based on the constructed background model, and a foreground extractor configured to extract the foreground from the image frame using the generated foreground probability map.

The foreground extractor may be configured to classify each pixel of the image frame into at least one of the foreground, a candidate, and a background based on a comparison of the foreground probability of each pixel of the image frame with a threshold value, and to categorize pixels classified into the candidate into the foreground or the background based on applying watershed segmentation to the pixels classified into the candidate.

The apparatus may include a memory configured to store instructions, and the processor may be configured to execute the instructions to estimate context information on a scene from an image frame of the image, to construct a background model of the image frame using the estimated context information, and to detect a foreground from the image frame based on the constructed background model.

The previous image frame may comprise the immediately previous image frame.

In another general aspect, there is provided a method of detecting a foreground in an image, the method including estimating context information on a scene from an image frame of the input image, constructing a background model of the image frame using the estimated context information, and detecting a foreground from the image frame based on the constructed background model.

The context information may include any one or any combination of background motion, a foreground speed, and an illumination change.

The constructing of the background model may include generating the background model of the image frame using a background model of a previous image frame based on information on the background motion.

The constructing of the background model may include correcting a variance of the generated background model in response to a value of a context variable based on the foreground speed being equal to or larger than a block size.

The constructing of the background model may include updating the generated background model based on the illumination change.

The updating of the background model may include updating a mean of the generated background model by reflecting the illumination change, in response to a value of a context variable being smaller than a block size, and updating a mean and a variance of the generated background model by reflecting the illumination change and additional information, in response to the value of the context variable being equal to or larger than the block size.

The detecting of a foreground may include generating a foreground probability map by calculating a foreground probability for each pixel of the image frame based on the constructed background model, and extracting the foreground from the image frame using the generated foreground probability map.

In another general aspect, there is provided a digital device including an antenna, a cellular radio configured to transmit and receive data via the antenna according to a cellular communications standard, a touch-sensitive display, a camera configured to capture an image, a memory configured to store instructions, and a processor configured to execute the instructions to receive the image from the camera, estimate context information on a scene from an image frame of the image, construct a background model of the image frame by amending a background model of a previous frame based on the estimated context information, and detect a foreground from the image frame based on the constructed background model.

Other features and aspects will be apparent from the following detailed description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating an example of a foreground detecting apparatus.

FIGS. 2A to 2C are diagrams illustrating examples of parts of a foreground detecting apparatus 100 shown in FIG. 1.

FIGS. 3A-3C are diagrams illustrating examples of estimation of a foreground speed.

FIG. 4 is an example of a diagram illustrating generation of a background model for an input image.

FIGS. 5A-5D are diagrams illustrating examples of detection of a foreground from an input image.

FIG. 6 is a diagram illustrating an example of a method of detecting a foreground.

FIG. 7 is a diagram illustrating an example of a method of detecting a foreground.

FIGS. 8A-8E are diagrams illustrating examples of detection of a foreground through recognition of a foreground speed.

FIGS. 9A-9E are diagrams illustrating examples of detection of a foreground through recognition of an illumination change.

Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals should be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.

DETAILED DESCRIPTION

The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. However, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be apparent after an understanding of the disclosure of this application. For example, the sequences of operations described herein are merely examples, and are not limited to those set forth herein, but may be changed as will be apparent after an understanding of the disclosure of this application, with the exception of operations necessarily occurring in a certain order. Also, descriptions of features that are known in the art may be omitted for increased clarity and conciseness.

The features described herein may be embodied in different forms, and are not to be construed as being limited to the examples described herein. Rather, the examples described herein have been provided merely to illustrate some of the many possible ways of implementing the methods, apparatuses, and/or systems described herein that will be apparent after an understanding of the disclosure of this application.

FIG. 1 is a diagram illustrating an example of a foreground detecting apparatus.

A foreground detecting apparatus 100 detects a foreground by processing an input image, and the following description will be made in relation to an example in which the foreground detecting apparatus 100 detects a foreground 20 from an image input 10, which is input from a moving camera. However, the following disclosure is not limited thereto, and may be applied to other examples, such as an example where a foreground is detected from an image acquired from a static camera. In this case, the image input to the foreground detecting apparatus 100 is input in units of frames, and for convenience of description, an image of a frame input at the current time point is referred to as a current frame, and an image of a frame input at a previous time point is referred to as a previous frame. In an example, the previous frame may be a frame input at an immediately previous time point and is referred to as an immediately previous frame.

Referring to FIG. 1, the foreground detecting apparatus 100 includes a context information estimator 110, a background model constructor 120, and a foreground detector 130.

When a current frame is input from a camera, the context information estimator 110 estimates context information on a scene of the current frame. In an example, the context information may include background motion occurring due to a movement of the camera, a foreground speed representing the speed of a moving object, and an illumination change representing an overall change in brightness of the scene. However, the context information is not limited thereto, and various other types of context information, such as, for example, time that the current frame is captured, location of the current frame, are considered to be well within the scope of the present disclosure.

For example, assuming that a background takes up a higher proportion than a foreground in an image, the context information estimator 110 may obtain motion information at each location in the input current frame, remove an outlier, and estimate background motion. In another example, the context information estimator 110 may determine a location that does not match the background motion estimated at each location of the current frame as a foreground, measure the speed of the location determined as the foreground, and estimate a foreground speed based on a relative speed obtained by subtracting a background speed from the measured speed. In addition, a viewing area of a moving camera may change significantly, causing an abrupt change in the overall brightness of an image. In order to reflect such a change in the brightness, the context information estimator 110 may estimate an illumination change.

The background model constructor 120 may construct a background model for the current frame by reflecting the estimated context information, for example, the background motion, the foreground speed, and the illumination change information. In an example, the background model includes a mean and a variance for a background, and is constructed in units of blocks having a size that includes a plurality of pixels, rather than being constructed in units of pixels. In an example, each block has a predetermined size.

The background model constructor 120 may construct a background model for a current frame by correcting a background model of a previous frame based on the estimated context information, which will be described in further detail later with reference to FIG. 2A. In an example, the previous frame is an immediately previous frame. For example, since a background does not move in a block unit, the background model constructor 120 may generate a background model for the current frame based on a background model of adjacent blocks or adjacent pixels of the immediately previous frame in consideration of the estimated background motion information. In an example, when a foreground speed is high, image matching may be determined to be poor due to edges, and in order to reduce a false alarm, an additional correction may be performed to increase variance of the generated background model.

After the background model for the current frame is generated as described above, the background model constructor 120 may update the background model by reflecting the estimated illumination change. In an example, in order to prevent contamination of the background by a foreground, the background model may be updated differently according to the foreground speed. For example, the background model constructor 120 may define a context variable based on a foreground speed, compare the context variable with a block size, and when a value of the context variable is smaller than the block size, update the mean by reflecting the illumination change, and not update the variance. In another example, when a value of the context variable is equal to or larger than the block size, the mean and the variance may be updated by reflecting the illumination change and/or additional information.

The foreground detector 130 may detect a foreground from the current frame using the background model constructed by the background model constructor 120, and output a foreground map 20. In an example, the foreground detector 130 does not independently determine the foreground on the basis of a pixel. The foreground detector 130 obtains a foreground probability map using the constructed background model and performs propagation from a region having a high probability to nearby foreground candidate areas indicating a foreground.

FIGS. 2A to 2C are diagrams illustrating parts 110, 120 and 130 of the foreground detecting apparatus 100 of FIG. 1. The foreground detecting apparatus 100 will be described in detail with reference to FIGS. 2A to 2C.

FIG. 2A shows an example of the context information estimator 110 shown in FIG. 1. In an example, the context information estimator 110 includes a background motion estimator 111, which estimates background motion, a foreground speed estimator 112, which measures a foreground speed, and an illumination change estimator 113, which estimates an illumination change.

The background motion estimator 111 may estimate background motion using a current frame and a previous frame. Generally, an image captured by a moving camera has motion even in a background, so the background motion needs to be considered to precisely detect a foreground. In an example, assuming that a background takes up a higher proportion than a foreground in an image, the background motion estimator 111 obtains a speed at each location, removes outliers, and estimates the background motion by fitting a projective model that covers as wide an area as possible.

For example, the background motion estimator 111 may generate a projective transform matrix H_(t:t-1) that represents background motion using Equation 1 below, and use the generated projective transform matrix H_(t:t-1) as background motion information.

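$\begin{matrix} {\begin{aligned} &(a)\quad I^{(t)}(x_{i} + u_{i},\, y_{i} + v_{i}) = I^{(t-1)}(x_{i},\, y_{i}) \\ &(b)\quad X_{i}^{(t-1)} = (x_{i},\, y_{i},\, 1)^{T},\quad X_{i}^{(t)} = (x_{i} + u_{i},\, y_{i} + v_{i},\, 1)^{T} \\ &(c)\quad \left[ X_{1}^{(t)},\, X_{2}^{(t)},\, \ldots \right] = H_{t:t-1}\left[ X_{1}^{(t-1)},\, X_{2}^{(t-1)},\, \ldots \right] \end{aligned}} & {\text{Equation 1}} \end{matrix}$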

Here, X_(i) ^((t-1)) and X_(i) ^((t)) represent coordinates of an i^(th) block of an immediately previous frame and coordinates of a current frame mapping the i^(th) block, respectively.

Referring to Equation 1, when a current frame is input at a time point t, the background motion estimator 111 converts the current frame into gray scale to obtain a converted current frame I^((t)), and divides an immediately previous frame I^((t-1)) input at an immediately previous time point t-1 into a plurality of blocks. Then, the background motion estimator 111 may find a region of the current frame I^((t)) that matches a center location (x_(i), y_(i)) of the i^(th) block of the immediately previous frame. In this case, based on an assumption in optical flow that the same brightness intensity is kept at a matching location, the background motion estimator 111 may obtain a speed of the i^(th) block (u_(i), v_(i)) satisfying Equation 1(a) above. The background motion estimator 111 may represent a relationship between the immediately previous frame and the current frame using the speed of the block as shown in Equation 1(b) above. The background motion estimator 111 may obtain the projective transform matrix H_(t:t-1) satisfying Equation 1(c) above, using the relationship. In an example, in order to effectively represent the motion of the entire scene, outliers may be removed.
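As an illustration of Equation 1(c), the following is a minimal sketch that fits a projective transform matrix to block centers and their per-block speeds. It assumes the speeds (u_(i), v_(i)) have already been obtained by some optical flow method, uses a plain least-squares direct linear transform as a stand-in for whatever solver an implementation would use, and omits the outlier removal mentioned above. The function names `fit_projective_transform` and `apply_transform` are hypothetical.

```python
import numpy as np

def fit_projective_transform(prev_pts, curr_pts):
    """Least-squares fit of H such that curr ~ H @ prev in homogeneous coordinates.

    prev_pts: (n, 2) block centers (x_i, y_i) in the previous frame.
    curr_pts: (n, 2) matched locations (x_i + u_i, y_i + v_i) in the current frame.
    Needs at least four correspondences in general position.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    curr_pts = np.asarray(curr_pts, dtype=float)
    rows = []
    for (x, y), (xc, yc) in zip(prev_pts, curr_pts):
        # Two linear constraints on the nine entries of H per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, xc * x, xc * y, xc])
        rows.append([0, 0, 0, -x, -y, -1, yc * x, yc * y, yc])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is not close to zero

def apply_transform(H, pts):
    """Map (n, 2) points through H and return inhomogeneous (n, 2) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

In practice, a robust estimator would typically wrap a fit like this so that blocks belonging to the foreground do not bias the background motion.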

When the projective transform matrix H_(t:t-1) is generated, the foreground speed estimator 112 may obtain a foreground speed associated with foreground motion using the projective transform matrix H_(t:t-1). For example, motion of a block that does not satisfy X_(i) ^((t))=H_(t:t-1)X_(i) ^((t-1)) may be determined to be the foreground motion, and accordingly, a corrected speed (û_(i), {circumflex over (v)}_(i)) may be obtained using Equation 2(a) below. In an example, a block in which the background is positioned has a corrected speed of 0, so only the speed of a region in which a foreground is positioned remains, which yields the speed of the foreground portion. Considering that noise may exist when a block speed calculated by Equation 1(a) is not precise, the foreground speed estimator 112 may estimate an average foreground speed using the foreground speed at each location spanning foreground pixels of an immediately previous frame, as shown in Equation 2(b).

$\begin{matrix} {\begin{aligned} &(a)\quad \left( \hat{u}_{i},\, \hat{v}_{i},\, 1 \right) = X_{i}^{(t)} - H_{t:t-1}X_{i}^{(t-1)} \\ &(b)\quad s^{(t)} = \frac{1}{P}\sum\limits_{p \in FG^{(t)}} \sqrt{\hat{u}_{p}^{2} + \hat{v}_{p}^{2}} \end{aligned}} & {\text{Equation 2}} \end{matrix}$

Here, FG^((t)) represents a foreground area of an immediately previous frame, p represents the location of a pixel in the foreground area, and P represents the number of pixels in the foreground area.
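The two parts of Equation 2 can be sketched as follows. This is an illustrative sketch only: it assumes the mapped point H_(t:t-1)X_(i) ^((t-1)) is normalized by its third homogeneous coordinate before the subtraction, and it averages over a boolean mask of sampled locations rather than over every pixel of the foreground area. The names `corrected_speed` and `mean_foreground_speed` are hypothetical.

```python
import numpy as np

def corrected_speed(H, prev_pts, curr_pts):
    """Residual motion (u_hat, v_hat) left after removing background motion H."""
    prev_pts = np.asarray(prev_pts, dtype=float)
    homog = np.hstack([prev_pts, np.ones((len(prev_pts), 1))])
    predicted = homog @ H.T
    predicted = predicted[:, :2] / predicted[:, 2:3]  # back to pixel coordinates
    return np.asarray(curr_pts, dtype=float) - predicted

def mean_foreground_speed(residual, fg_mask):
    """Average residual magnitude over locations flagged as foreground."""
    magnitude = np.hypot(residual[:, 0], residual[:, 1])
    fg_mask = np.asarray(fg_mask, dtype=bool)
    if not fg_mask.any():
        return 0.0  # assumption: no known foreground means zero speed
    return float(magnitude[fg_mask].mean())
```

Locations that follow the background motion produce a residual near zero, so only foreground locations contribute meaningfully to the average.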

FIGS. 3A-3C are diagrams illustrating examples of estimation of a foreground speed. FIG. 3A shows an input image. Using Equation 2, background motion is corrected in the input image, and then a corrected speed of a foreground portion, i.e., a speed of a bicycle moving rightward, is obtained as in FIG. 3B. FIG. 3C shows a foreground extracted using the speed of the bicycle obtained in FIG. 3B.

Referring again to FIG. 2A, the illumination change estimator 113 may measure the overall brightness change of a scene when a current frame is input. For example, the illumination change estimator 113 may estimate an illumination change b^((t)) using a difference between a mean of brightness intensities I_(j) ^((t)) of a current frame and a mean of means {tilde over (μ)}_(i) ^((t)) of background models, as shown in Equation 3 below. Although the background model of the current frame may be used, the present disclosure is not limited thereto. In an example, a background model of an immediately previous frame may be used.

$\begin{matrix} {b^{(t)} = \frac{1}{N}\sum\limits_{j=1}^{N} I_{j}^{(t)} - \frac{1}{M}\sum\limits_{i=1}^{M} \tilde{\mu}_{i}^{(t)}} & {\text{Equation 3}} \end{matrix}$
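Equation 3 reduces to a difference of two averages. The sketch below assumes that N counts the pixels of the current frame and M counts the block-level background models, which the text does not state explicitly; the function name `illumination_change` is hypothetical.

```python
import numpy as np

def illumination_change(frame_gray, bg_means):
    """Mean frame brightness minus mean of the block-level background means."""
    frame_gray = np.asarray(frame_gray, dtype=float)
    bg_means = np.asarray(bg_means, dtype=float)
    return float(frame_gray.mean() - bg_means.mean())
```

A positive result indicates that the scene has become brighter overall than the background model expects.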

FIG. 2B is a diagram illustrating an example of the background model constructor 120 of FIG. 1. In an example, the background model constructor 120 includes a background model generator 121 and a background model updater 122.

The background model generator 121 may generate a background model for a current image frame using the estimated context information. The background model generator 121 may generate a background model of a current frame by correcting a mean and a variance of a background model of an immediately previous frame to match a location of the current frame using background motion information. The background motion information represents a position relationship in mapping between the immediately previous frame and the current frame, from among the estimated context information.

For example, the background model generator 121 may generate a background model of a current frame using Equation 4 below. The background model generator 121 may map the location of the current frame to the immediately previous frame using an inverse matrix of the projective transform matrix H_(t:t-1) that represents the background motion as in Equation 4(a). A first location of the current frame may be mapped to a second location that is not a central position of a block of the immediately previous frame. In an example, the background model generator 121 may determine a plurality of locations around the second location, and generate a background model for the first location of the current frame using background models of the plurality of locations.

FIG. 4 is a diagram illustrating an example of generation of a background model with respect to an input image. In an example, the background model generator 121 uses a bilinear interpolation method as shown in FIG. 4. Assuming that coordinates of the second location corresponding to the first location are (x,y), the background model generator 121 may determine locations P1, P2, P3, and P4 given by the set of integer coordinates {(└x┘, └y┘), (└x┘, ┌y┐), (┌x┐, └y┘), (┌x┐, ┌y┐)}. For example, when the first location has coordinates (10,10) and the mapped second location has coordinates (5.3,4.7), the determined plurality of integer locations may be {(5,4), (5,5), (6,4), (6,5)}. Using background models (μ₁, σ₁), (μ₂, σ₂), (μ₃, σ₃), and (μ₄, σ₄) of the determined locations P1, P2, P3, and P4, a background model ({tilde over (μ)}_(i), {tilde over (σ)}_(i)) of the second location is generated.

In other words, using (b) and (c) of Equation 4, a mean μ_(k) ^((t-1)) and a variance σ_(k) ^((t-1)) of a background model at each location of the immediately previous frame are subject to weighted summation, thereby generating a background model ({tilde over (μ)}_(i) ^((t-1)), {tilde over (σ)}_(i) ^((t-1))) of the second location of the immediately previous frame. The background model of the second location may be constructed as a background model for the first location of the current frame by the background model updater 122 reflecting an illumination change.

$\begin{matrix} {\begin{aligned} &(a)\quad X_{i}^{(t-1)} = H_{t:t-1}^{-1}X_{i}^{(t)} \\ &(b)\quad \tilde{\mu}_{i}^{(t-1)} = \sum\limits_{k \in R_{i}} w_{k}\mu_{k}^{(t-1)} \\ &(c)\quad \tilde{\sigma}_{i}^{(t-1)} = \sum\limits_{k \in R_{i}} w_{k}\sigma_{k}^{(t-1)} \end{aligned}} & {\text{Equation 4}} \end{matrix}$

In this case, a weight value may be determined by an area formed by a second location Ri and each integer location P1, P2, P3, and P4 as illustrated in FIG. 4, and may be calculated according to Equation 5 below.

$\begin{matrix} {\begin{aligned} w_{1} &= \frac{\left( \lceil x \rceil - x \right)\left( \lceil y \rceil - y \right)}{\left( \lceil x \rceil - \lfloor x \rfloor \right)\left( \lceil y \rceil - \lfloor y \rfloor \right)} \\ w_{2} &= \frac{\left( x - \lfloor x \rfloor \right)\left( \lceil y \rceil - y \right)}{\left( \lceil x \rceil - \lfloor x \rfloor \right)\left( \lceil y \rceil - \lfloor y \rfloor \right)} \\ w_{3} &= \frac{\left( \lceil x \rceil - x \right)\left( y - \lfloor y \rfloor \right)}{\left( \lceil x \rceil - \lfloor x \rfloor \right)\left( \lceil y \rceil - \lfloor y \rfloor \right)} \\ w_{4} &= \frac{\left( x - \lfloor x \rfloor \right)\left( y - \lfloor y \rfloor \right)}{\left( \lceil x \rceil - \lfloor x \rfloor \right)\left( \lceil y \rceil - \lfloor y \rfloor \right)} \end{aligned}} & {\text{Equation 5}} \end{matrix}$
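The weighted summation of Equations 4(b), 4(c), and 5 can be sketched as below. The sketch keys each weight by its integer location instead of by the labels P1 to P4, since the label-to-corner assignment depends on FIG. 4. It also assumes that a coordinate that is already an integer is treated as an exact hit, because the denominator of Equation 5 would otherwise be zero. The names `bilinear_weights` and `interpolate_model`, and the convention that arrays are indexed as [y, x], are assumptions made for illustration.

```python
import math
import numpy as np

def bilinear_weights(x, y):
    """Return {(xk, yk): weight} for the integer locations around (x, y)."""
    x0, x1 = math.floor(x), math.ceil(x)
    y0, y1 = math.floor(y), math.ceil(y)
    fx = x - x0  # fractional offsets; zero when the coordinate is an integer
    fy = y - y0
    corners = [
        ((x0, y0), (1 - fx) * (1 - fy)),
        ((x1, y0), fx * (1 - fy)),
        ((x0, y1), (1 - fx) * fy),
        ((x1, y1), fx * fy),
    ]
    weights = {}
    for key, w in corners:
        # Coinciding corners (integer x or y) accumulate into a single entry.
        weights[key] = weights.get(key, 0.0) + w
    return weights

def interpolate_model(mu_prev, sigma_prev, x, y):
    """Weighted sums of the previous means and variances at (x, y)."""
    weights = bilinear_weights(x, y)
    mu = sum(w * mu_prev[yk, xk] for (xk, yk), w in weights.items())
    sigma = sum(w * sigma_prev[yk, xk] for (xk, yk), w in weights.items())
    return float(mu), float(sigma)
```

For the mapped location (5.3, 4.7) used in the example above, the weights come out to approximately 0.21 for (5,4), 0.49 for (5,5), 0.09 for (6,4), and 0.21 for (6,5), which sum to 1.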

After the background model of the second location is generated, the background model generator 121 may additionally correct the variance by reflecting a foreground speed on the variance. For example, the background model generator 121 defines a counting variable and a context variable, initially sets the counting variable c^((t)) to 1, and multiplies the counting variable c^((t)) by a foreground speed s^((t)) such that a result of the multiplication is set as the value of the context variable c^((t))·s^((t)). When the context variable has a value smaller than a block size B as in Equation 6 below, the counting variable is increased by 1, and otherwise, the counting variable is reset to 1.

$\begin{matrix} {c^{(t+1)} = \begin{cases} c^{(t)} + 1, & \text{if } c^{(t)} \cdot s^{(t)} < B \\ 1, & \text{otherwise} \end{cases}} & {\text{Equation 6}} \end{matrix}$

The background model generator 121 maintains the variance when the foreground speed is slow, i.e., the context variable value is smaller than the block size as shown in Equation 7 below. Otherwise, squares of differences between a mean of the background model generated for the second location and means of the background models of the respective locations are subject to weighted summation, and the result is added to the variance of the background model generated for the second location, thereby additionally correcting the variance by reflecting the foreground speed.

$\begin{matrix} {\tilde{\sigma}_{i}^{(t-1)} = \begin{cases} \tilde{\sigma}_{i}^{(t-1)}, & \text{if } c^{(t)} \cdot s^{(t)} < B \\ \tilde{\sigma}_{i}^{(t-1)} + \sum\limits_{k \in R_{i}} w_{k}\left( \tilde{\mu}_{i}^{(t-1)} - \mu_{k}^{(t-1)} \right)^{2}, & \text{otherwise} \end{cases}} & {\text{Equation 7}} \end{matrix}$
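Equations 6 and 7 share the same test on the context variable, which the following sketch makes explicit. The function names and argument layout are hypothetical; the neighbor means and weights are assumed to be the ones used in the weighted summation of Equation 4.

```python
def next_counting_variable(c, s, block_size):
    """Equation 6: keep counting while c * s stays below the block size."""
    return c + 1 if c * s < block_size else 1

def correct_variance(var_tilde, mean_tilde, neighbor_means, weights,
                     c, s, block_size):
    """Equation 7: inflate the warped variance when the context variable is large."""
    if c * s < block_size:
        return var_tilde  # slow foreground: variance is maintained
    # Weighted sum of squared differences between the warped mean and
    # the means of the neighboring background models.
    spread = sum(w * (mean_tilde - m) ** 2
                 for w, m in zip(weights, neighbor_means))
    return var_tilde + spread
```

With a block size of 8 and a foreground speed of 2, the counting variable grows from 1 to 4 and is then reset, because 4 times 2 is no longer smaller than 8.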

The background model updater 122 may update the generated background model by reflecting the foreground speed s^((t)) and the illumination change b^((t)). For example, the background model updater 122 may update the mean and the variance of the background model differently according to the foreground speed.

$\begin{matrix} {\begin{aligned} \mu_{i}^{(t)} &= \begin{cases} \tilde{\mu}_{i}^{(t-1)} + b^{(t)}, & \text{if } c^{(t)} \cdot s^{(t)} < B \\ \frac{\alpha_{i}^{(t-1)}}{\alpha_{i}^{(t-1)} + 1}\left( \tilde{\mu}_{i}^{(t-1)} + b^{(t)} \right) + \frac{1}{\alpha_{i}^{(t-1)} + 1}M_{i}^{(t)}, & \text{otherwise} \end{cases} \\ \sigma_{i}^{(t)} &= \begin{cases} \sigma_{i}^{(t-1)}, & \text{if } c^{(t)} \cdot s^{(t)} < B \\ \frac{\alpha_{i}^{(t-1)}}{\alpha_{i}^{(t-1)} + 1}\sigma_{i}^{(t-1)} + \frac{1}{\alpha_{i}^{(t-1)} + 1}V_{i}^{(t)}, & \text{otherwise} \end{cases} \end{aligned}} & {\text{Equation 8}} \end{matrix}$

In an example, the background model updater 122 compares the context variable value defined as above with a block size, and if the context variable value is smaller than the block size, updates the mean by reflecting an illumination change and maintains the variance. In another example, if the context variable value is equal to or larger than the block size, the background model updater 122 updates the mean and the variance by reflecting additional information thereon. In this case, the background model updater 122 may update the mean by reflecting additional information, i.e., a value M_(i) ^((t)) defined as a block brightness intensity average, together with the illumination change, and update the variance by reflecting additional information, i.e., a maximum value V_(i) ^((t)) of a square of a difference between the mean of the background model and an input image, according to Equation 9 below.

$\begin{matrix} {\begin{matrix} {M_{i}^{(t)} = {\frac{1}{\left| B_{i} \right|}{\sum\limits_{j \in B_{i}}I_{j}^{(t)}}}} \\ {V_{i}^{(t)} = {\max\limits_{j \in B_{i}}\left( {\mu_{i}^{(t)} - I_{j}^{(t)}} \right)^{2}}} \end{matrix}} & {{Equation}\mspace{14mu} 9} \end{matrix}$
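As a non-limiting illustration, the update of Equations 8 and 9 for a single block may be sketched as follows. The function name `update_block_model` and its parameter names are hypothetical and do not appear in the description; the block is represented simply as a list of brightness intensities, and the variance term is taken as written in Equation 8.

```python
from typing import List, Tuple


def update_block_model(
    mu_tilde: float,
    sigma: float,
    alpha: float,
    b: float,
    block_pixels: List[float],
    context: float,
    block_size: float,
) -> Tuple[float, float]:
    """Return the updated (mean, variance) of one block per Equations 8 and 9."""
    compensated_mean = mu_tilde + b  # mean shifted by the illumination change
    if context < block_size:
        # Context variable below the block size: keep the variance unchanged.
        return compensated_mean, sigma

    keep = alpha / (alpha + 1.0)   # weight of the existing model
    learn = 1.0 / (alpha + 1.0)    # weight of the new observation

    m = sum(block_pixels) / len(block_pixels)           # M_i^(t), block average
    mu = keep * compensated_mean + learn * m
    v = max((mu - p) ** 2 for p in block_pixels)        # V_i^(t), uses updated mean
    return mu, keep * sigma + learn * v


if __name__ == "__main__":
    print(update_block_model(100.0, 4.0, 1.0, 2.0, [110.0, 106.0], 1.0, 4.0))
    print(update_block_model(100.0, 4.0, 1.0, 2.0, [110.0, 106.0], 8.0, 4.0))
```

In this sketch the mean is computed before V_(i) ^((t)) because Equation 9 refers to the updated mean μ_(i) ^((t)).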

In an example, the background model updater 122 may define a time-variant variable α_(i) ^((t)) that represents a frequency at which the block appears on scenes from a first frame to the current frame due to camera motion as shown in Equation 10. The background model updater 122 may adjust an update intensity of the mean and the variance using the time-variant variable. In an example, the time-variant variable has a value of 1 for a block which appears for the first time on the scene. As such, using the time-variant variable to adjust the update intensity causes an area that appears for the first time on the scene to be updated at a significantly high speed, and an area that has appeared for a long time to be updated at a low speed. In an example, when the time-variant variable is a predetermined value or above, an update may not be performed. In order to prevent such a constraint, the time-variant variable may have a maximum value α_(max).

$\begin{matrix} {\alpha_{i}^{(t)} = {{\overset{\sim}{\alpha}}_{i}^{({t - 1})} + 1}} & {{Equation}\mspace{14mu} 10} \end{matrix}$

FIG. 2C is a diagram illustrating an example of the foreground detector 130, which includes a foreground probability map generator 131 and a foreground extractor 132.

The foreground probability map generator 131 may generate a foreground probability map representing a probability that an input pixel corresponds to a foreground using the background model (μ^((t)), σ^((t))) for the current frame. For example, the foreground probability map generator 131 may generate a foreground probability map f_(FG)(j) representing a probability that a j^(th) pixel in the i^(th) block corresponds to the foreground through Equation 11.

$\begin{matrix} {{f_{FG}(j)} = \frac{\left( {I_{j}^{(t)} - \mu_{i}^{(t)}} \right)^{2}}{\sigma_{i}^{(t)}}} & {{Equation}\mspace{14mu} 11} \end{matrix}$

The foreground extractor 132 may extract the foreground from the current image frame using the generated foreground probability map. For example, the foreground extractor 132 may generate a classification result L_(init)(j) among a foreground, a candidate, and a background by comparing a foreground probability value with respect to the j^(th) pixel with preset threshold values T_(low) and T_(high) through Equation 12 below. When a pixel is classified into a candidate, the classification result L_(init)(j) is used as an input for watershed segmentation such that the pixel is finally classified into the foreground or the background.

$\begin{matrix} {{L_{init}(j)} = \left\{ \begin{matrix} {Background} & {{{if}\mspace{14mu}{f_{FG}(j)}} \leq T_{low}} \\ {Candidate} & {{{if}{\mspace{11mu}\;}T_{low}} \leq {f_{FG}(j)} < T_{high}} \\ {Foreground} & {{{if}\mspace{14mu} T_{high}} \leq {f_{FG}(j)}} \end{matrix} \right.} & {{Equation}\mspace{14mu} 12} \end{matrix}$
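As a non-limiting illustration, Equations 11 and 12 may be sketched for a single pixel as follows. The function names and the threshold values used in the usage lines are hypothetical; where both the background and candidate conditions of Equation 12 hold (a score exactly equal to T_(low)), this sketch assumes that the first listed case, the background, applies.

```python
def foreground_score(pixel: float, mean: float, variance: float) -> float:
    """Equation 11: squared deviation from the block mean, scaled by the variance."""
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return (pixel - mean) ** 2 / variance


def classify_pixel(score: float, t_low: float, t_high: float) -> str:
    """Equation 12: three-way initial label from two thresholds."""
    if score <= t_low:
        return "background"
    if score < t_high:
        return "candidate"
    return "foreground"


if __name__ == "__main__":
    # Hypothetical block model (mean 100, variance 4) and thresholds (1 and 9).
    for intensity in (101.0, 104.0, 110.0):
        score = foreground_score(intensity, 100.0, 4.0)
        print(intensity, score, classify_pixel(score, 1.0, 9.0))
```

Pixels labeled as candidates would then be passed to the segmentation step described in the text.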

FIGS. 5A-5D are diagrams illustrating examples of detecting a foreground from an input image.

FIG. 5A represents an input current frame, FIG. 5B is a foreground probability map generated by the foreground probability map generator 131, and FIG. 5C is a result in which the foreground extractor 132 classifies the current frame into a background, a foreground, and a candidate area using the foreground probability map. In FIG. 5D, a white area represents a foreground area, a black area represents the candidate area, and a gray area represents a background area. In an example, the candidate area may include a noise portion or a portion of the foreground area that has a foreground probability reduced due to its color being similar to the background. The candidate area is classified into the foreground area (the white area) and the background area (the black area) using the watershed segmentation method.

FIG. 6 is a diagram illustrating an example of a method of detecting a foreground. The operations in FIG. 6 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 6 may be performed in parallel or concurrently.

The method of detecting a foreground according to an example described in FIG. 6 is performed by the foreground detecting apparatus 100. The method of detecting a foreground performed by the foreground detecting apparatus 100 should be understood as described with reference to FIGS. 1 to 5. In addition to the description of FIG. 6 below, the above descriptions of FIGS. 1-5 are also applicable to FIG. 6, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIG. 6, in 610, when a current frame of an image is input, the foreground detecting apparatus 100 may estimate context information representing a state of a scene of the current frame. In an example, the context information may include background motion, a foreground speed, and illumination change information, but the context information is not limited thereto. For example, the background motion may be obtained by obtaining motion information at each location in the current frame, removing an outlier, and representing the motion information as a projective transform matrix. In addition, the foreground detecting apparatus 100 may determine a location that does not match the background motion estimated at each location of the current frame as a foreground location using the projective transform matrix, and the foreground detecting apparatus 100 may estimate a speed of the determined foreground location. A moving camera has a viewing area that may change significantly, causing an abrupt change in overall brightness of an image. The illumination change may be used as information that reflects a change in the brightness of the image.

In 620, the foreground detecting apparatus 100 may construct a background model for the current frame by reflecting the estimated context information, for example, the background motion, the foreground speed, and the illumination change information.

For example, since a background does not move in units of blocks, the background model constructor 120 may generate the background model for the current frame based on a background model of adjacent blocks or adjacent pixels of an immediately previous frame in consideration of the estimated background motion information. In an example, when a foreground speed is high, image matching may be determined to be poor due to edges, and in order to reduce a false alarm, an additional correction may be performed to increase a variance of the generated background model.

After the background model for the current frame is generated, the background model may be updated by reflecting the estimated illumination change. In an example, in order to prevent contamination of the background by a foreground, the background model may be updated differently according to the foreground speed. For example, when a value of a context variable defined based on the foreground speed is smaller than a block size, a mean is updated by reflecting the illumination change thereon, and the variance is not updated. In another example, when the value of the context variable is equal to or larger than the block size, the mean and the variance are updated by reflecting the illumination change and/or additional information.

In 630, after the background model for the current frame is constructed, a foreground may be detected from the current frame using the constructed background model. For example, a foreground probability map is generated using the background model, and using the foreground probability map, propagation is performed from a region having a high probability to nearby foreground candidate areas while the foreground is indicated. In an example, a lower limit and an upper limit of a threshold value are set, and a propagation probability value is compared with the lower limit and the upper limit of the threshold value so that a foreground area, a candidate area and a background area are classified, and the candidate area is subject to watershed segmentation so as to be classified into the background area or the foreground area.
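As a non-limiting illustration, the propagation from high-probability regions into nearby candidate areas may be sketched with a breadth-first fill over a label grid. This is a deliberately simplified stand-in and is not the watershed segmentation named in the description; the function name `resolve_candidates`, the single-letter labels ("F", "C", "B"), and the choice of 4-connectivity are assumptions made only for this sketch.

```python
from collections import deque
from typing import List


def resolve_candidates(labels: List[List[str]]) -> List[List[str]]:
    """Grow foreground ("F") into 4-connected candidates ("C").

    Candidates that are never reached from a foreground pixel become
    background ("B"). The input grid is assumed to be non-empty.
    """
    height, width = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    queue = deque(
        (r, c) for r in range(height) for c in range(width) if out[r][c] == "F"
    )
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < height and 0 <= nc < width and out[nr][nc] == "C":
                out[nr][nc] = "F"          # candidate touching foreground
                queue.append((nr, nc))
    # Any candidate left over was not connected to a foreground seed.
    return [["B" if v == "C" else v for v in row] for row in out]


if __name__ == "__main__":
    grid = [["F", "C", "B", "C"],
            ["B", "C", "B", "B"]]
    for row in resolve_candidates(grid):
        print(" ".join(row))
```

A watershed-based implementation would additionally use image gradients to decide where the propagation stops, which this sketch does not model.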

FIG. 7 is a diagram illustrating an example of a method of detecting a foreground. The operations in FIG. 7 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 7 may be performed in parallel or concurrently.

The method of detecting a foreground according to an example described in FIG. 7 is performed by the foreground detecting apparatus 100. In addition to the description of FIG. 7 below, the above descriptions of FIGS. 1-6 are also applicable to FIG. 7, and are incorporated herein by reference. Thus, the above description may not be repeated here.

In 711, when a current frame is input from a camera, the foreground detecting apparatus 100 may estimate background motion by comparing the current frame with an immediately previous frame. Assuming that a background generally takes up a higher proportion than a foreground in an image, the foreground detecting apparatus 100 may obtain a speed at each location in units of blocks, remove an outlier, and generate a projective transform matrix that represents the background motion.

In 712, after the projective transform matrix representing the background motion is generated, a foreground speed may be estimated based on the projective transform matrix. For example, when the immediately previous frame is mapped to the current frame using the projective transform matrix, a location that does not properly satisfy the mapping is determined as a location of the foreground. By correcting each location calculated in the estimation of the background motion, the foreground speed is estimated. In an example, if a block speed is not precisely calculated in the estimation of the background motion, noise may exist. Accordingly, using only the foreground speeds of pixels in a foreground area of the immediately previous frame, the average foreground speed is estimated.
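As a non-limiting illustration, the correction of a measured speed by the background motion may be sketched as the residual between the observed displacement and the displacement predicted by the projective transform. The function names are hypothetical, the matrix is assumed to map locations of the immediately previous frame to the current frame, and returning 0.0 when no foreground pixel exists is an assumption of this sketch.

```python
import math
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def project(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Apply a 3x3 projective transform to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def residual_speed(
    h: Matrix3, prev_xy: Tuple[float, float], flow: Tuple[float, float]
) -> float:
    """Speed remaining after removing the background motion predicted by h."""
    px, py = prev_xy
    bx, by = project(h, px, py)            # where the background would land
    ox, oy = px + flow[0], py + flow[1]    # where the location actually moved
    return math.hypot(ox - bx, oy - by)


def average_foreground_speed(
    speeds: Sequence[float], prev_foreground: Sequence[bool]
) -> float:
    """Average only over locations that were foreground in the previous frame."""
    selected = [s for s, fg in zip(speeds, prev_foreground) if fg]
    return sum(selected) / len(selected) if selected else 0.0


if __name__ == "__main__":
    shift = [[1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # 3-pixel pan
    print(residual_speed(shift, (10.0, 10.0), (3.0, 0.0)))  # moves with background
    print(residual_speed(shift, (10.0, 10.0), (6.0, 4.0)))  # moves independently
```

A location whose residual is close to zero is consistent with the background motion, whereas a large residual indicates a location that does not satisfy the mapping.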

In 713, in order to reflect the foreground speed on the construction of a background model, a context variable may be set based on the foreground speed. For example, a counting variable is set to 1, and the counting variable is multiplied by the foreground speed to set a value of the context variable.

In 714, an illumination change may be estimated based on the background model. In an example, using a difference between a mean of brightness intensities of a current frame and a mean of means of background models, the illumination change may be estimated.
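As a non-limiting illustration, the illumination change may be sketched as a single scalar offset. The function name is hypothetical, and the sign convention (current frame minus background) is an assumption chosen so that the offset can be added to the background mean as in Equation 8.

```python
from typing import Sequence


def estimate_illumination_change(
    frame_pixels: Sequence[float], background_means: Sequence[float]
) -> float:
    """Mean brightness of the frame minus the mean of the background-model means."""
    frame_mean = sum(frame_pixels) / len(frame_pixels)
    model_mean = sum(background_means) / len(background_means)
    return frame_mean - model_mean


if __name__ == "__main__":
    # A frame that is brighter than the learned background yields a positive offset.
    print(estimate_illumination_change([110.0, 120.0, 130.0], [100.0, 110.0]))
```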

In 721, after the projective transform matrix representing the background motion is generated, the foreground detecting apparatus 100 may generate a background model for the current frame using background motion information. For example, a certain location of the current frame is mapped to an immediately previous frame using the projective transform matrix, integer locations around the mapped location of the immediately previous frame are determined using bilinear interpolation, and background models of the determined plurality of locations are subject to weighted summation, thereby generating a background model. The background model that is generated is subject to an update process to be constructed as a background model of the current frame.
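As a non-limiting illustration, the weighted summation over the integer locations surrounding a mapped location may be sketched as follows. The function names and the dictionary-based storage of the previous models, keyed by integer (x, y) locations, are hypothetical; handling of locations that fall outside the previous frame is omitted from this sketch.

```python
import math
from typing import Dict, List, Tuple

Location = Tuple[int, int]


def bilinear_neighbors(x: float, y: float) -> List[Tuple[Location, float]]:
    """Four integer locations around (x, y) with their bilinear weights."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    return [
        ((x0, y0), (1 - fx) * (1 - fy)),
        ((x0 + 1, y0), fx * (1 - fy)),
        ((x0, y0 + 1), (1 - fx) * fy),
        ((x0 + 1, y0 + 1), fx * fy),
    ]


def warp_model(
    prev_means: Dict[Location, float],
    prev_vars: Dict[Location, float],
    x: float,
    y: float,
) -> Tuple[float, float]:
    """Weighted sum of the previous means and variances at the mapped location."""
    neighbors = bilinear_neighbors(x, y)
    mean = sum(w * prev_means[loc] for loc, w in neighbors)
    var = sum(w * prev_vars[loc] for loc, w in neighbors)
    return mean, var


if __name__ == "__main__":
    means = {(1, 2): 100.0, (2, 2): 120.0, (1, 3): 90.0, (2, 3): 95.0}
    variances = {(1, 2): 4.0, (2, 2): 8.0, (1, 3): 6.0, (2, 3): 6.0}
    print(warp_model(means, variances, 1.25, 2.0))
```

The four weights always sum to 1, so a mapped location that coincides with an integer location simply copies the model stored there.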

In 722, the foreground detecting apparatus 100 may compare the context variable set in operation 713 with a block size in order to differently update the background model in consideration of the foreground speed.

In 723, when it is determined as a result of the comparison of operation 722 that the context variable is equal to or larger than the block size, the variance of the background model generated in operation 721 is additionally corrected. In an example, squares of differences between a mean of the generated background model and means of the background models of the respective locations are subject to weighted summation. The result of the weighted summation is added to a variance of the background model so that a lower foreground probability is calculated in foreground extraction, thereby reducing a false alarm.
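As a non-limiting illustration, the additional variance correction may be sketched as follows. The function name and argument names are hypothetical; the weights are assumed to be the same weights that were used in the weighted summation that generated the background model.

```python
from typing import Sequence


def correct_variance(
    sigma_tilde: float,
    mu_tilde: float,
    neighbor_means: Sequence[float],
    weights: Sequence[float],
    context: float,
    block_size: float,
) -> float:
    """Inflate the generated variance when the context variable reaches the block size."""
    if context < block_size:
        return sigma_tilde  # no additional correction
    spread = sum(
        w * (mu_tilde - m) ** 2 for w, m in zip(weights, neighbor_means)
    )
    return sigma_tilde + spread


if __name__ == "__main__":
    # Neighboring means that disagree strongly (for example, across an edge)
    # raise the variance, which lowers the resulting foreground probability.
    print(correct_variance(5.0, 105.0, [100.0, 120.0], [0.75, 0.25], 8.0, 4.0))
    print(correct_variance(5.0, 105.0, [100.0, 120.0], [0.75, 0.25], 1.0, 4.0))
```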

In 724, the background model may be updated by reflecting illumination change and additional information on the background model. For example, as shown in Equation 8, the mean of the background model corrected in operation 723 may be updated by reflecting an illumination change and additional information, which is a value defined as a block brightness intensity average, thereon, and the variance of the background model may be updated by reflecting additional information, which is a maximum value of a square of a difference between the mean of the background model and an input image, thereon. In an example, by reflecting a time-variant variable calculated by Equation 10 on the update of the mean and the variance, an update intensity may be adjusted. As such, a block that has appeared for the first time on the scene is updated at a high speed, and a block that has appeared for a long time is updated at a low speed.
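As a non-limiting illustration, the effect of the time-variant variable on the update intensity may be sketched as follows. The function names are hypothetical, and applying the maximum value α_(max) by clamping with `min` is an assumption of this sketch; the weight 1/(α+1) is the coefficient of the new observation in Equation 8.

```python
from typing import List


def next_alpha(alpha_tilde: float, alpha_max: float) -> float:
    """Equation 10 with an upper bound: age the block by one frame, capped at alpha_max."""
    return min(alpha_tilde + 1.0, alpha_max)


def learning_weight(alpha: float) -> float:
    """Weight given to the new observation in Equation 8."""
    return 1.0 / (alpha + 1.0)


def weight_schedule(frames: int, alpha_max: float) -> List[float]:
    """Learning weights of a block that first appears with alpha equal to 1."""
    alpha, weights = 1.0, []
    for _ in range(frames):
        weights.append(learning_weight(alpha))
        alpha = next_alpha(alpha, alpha_max)
    return weights


if __name__ == "__main__":
    # The weight starts at 0.5 for a newly appearing block and decays,
    # but never drops below 1 / (alpha_max + 1).
    print(weight_schedule(6, 4.0))
```

Without the cap, the weight of new observations would approach zero for long-lived blocks, which corresponds to the case in which an update is effectively no longer performed.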

In 725, the counting variable is reset to 1 for the next frame.

In 726, when it is determined as a result of the comparison of operation 722 that the context variable is smaller than the block size, the background model may be updated by reflecting the illumination change thereon without the additional correction of the generated background model. In an example, as shown in Equation 8, the mean of the background model generated in operation 721 is updated by adding the illumination change to the mean, and the variance of the background model is maintained.

In 727, the counting variable is increased by 1 for the next frame. The increase of the counting variable prevents the background model from not being updated for a long time when the foreground speed is always slow compared to the block size.
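As a non-limiting illustration, the interplay of operations 713, 722, 725, and 727 may be sketched as a small per-frame state update. The function names `step_counter` and `simulate` are hypothetical, and the speed and block size in the usage lines are arbitrary example values.

```python
from typing import List, Sequence, Tuple


def step_counter(count: int, speed: float, block_size: float) -> Tuple[bool, int]:
    """Return (full_update, next_count) for one frame.

    The context variable is the counting variable multiplied by the
    foreground speed. A full update resets the counter to 1; otherwise the
    counter grows by 1 so that a slow foreground eventually triggers one.
    """
    context = count * speed
    if context >= block_size:
        return True, 1
    return False, count + 1


def simulate(speeds: Sequence[float], block_size: float) -> List[bool]:
    """Run the counter over a sequence of per-frame foreground speeds."""
    count, history = 1, []
    for speed in speeds:
        full_update, count = step_counter(count, speed, block_size)
        history.append(full_update)
    return history


if __name__ == "__main__":
    # With a speed of 1.5 and a block size of 4, every third frame is a full update.
    print(simulate([1.5] * 6, 4.0))
```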

In 731, the foreground detecting apparatus 100 may generate a foreground probability map representing a probability that each pixel corresponds to the foreground using the updated background model.

In 732, the foreground detecting apparatus 100 may extract the foreground from the current frame using the generated foreground probability map. For example, a lower limit and an upper limit of a probability threshold are set, and the foreground probability of each pixel is compared with the lower limit and the upper limit so that each pixel is classified into one of the foreground, a candidate, and the background. The pixel classified into the candidate is subject to watershed segmentation to be classified into one of a background area or a foreground area.

FIGS. 8A-8E are diagrams illustrating detection of a foreground through recognition of a foreground speed.

FIG. 8A represents an input image in which a foreground speed is slow, and FIGS. 8B and 8C represent a background and a foreground extracted using a general foreground detecting method, respectively. It can be seen that a background model has a foreground introduced thereto in FIG. 8B so that the foreground is not properly detected in FIG. 8C even if the background is relatively simple. In contrast, FIGS. 8D and 8E represent results detected using the above-described examples, in which the background is not contaminated by the foreground in FIG. 8D so that the foreground is precisely detected in FIG. 8E.

FIGS. 9A-9E are diagrams illustrating detection of a foreground through recognition of an illumination change.

FIG. 9A represents an input image in which a brightness of a scene is greatly changed by an auto-exposure control function when a camera moves abruptly in up and down directions. FIGS. 9B and 9C represent a background and a foreground detected without the illumination change reflected thereon. FIGS. 9D and 9E represent a background and a foreground detected with the illumination change reflected thereon. When the brightness of a scene of an input image is greatly changed, ignoring the illumination change may cause a boundary of the brightened area on the scene to be indicated as shown in the background image of FIG. 9B, thereby causing the background image to be broken. In contrast, when the illumination change information is reflected in accordance with the examples described above, an overall brightness change of the background is corrected as in FIG. 9D, so that a smooth background model is constructed, thereby obtaining a foreground area robust to an illumination change as in FIG. 9E.

In an example, the foreground detecting apparatus 100 may be embedded in or interoperate with a camera. In an example, the camera may be embedded in or interoperate with various digital devices such as, for example, a mobile phone, a cellular phone, a smart phone, a wearable smart device (such as, for example, a ring, a watch, a pair of glasses, a glasses-type device, a bracelet, an ankle bracelet, a belt, a necklace, an earring, a headband, a helmet, or a device embedded in clothes), a personal computer (PC), a laptop, a notebook, a subnotebook, a netbook, an ultra-mobile PC (UMPC), a tablet personal computer (tablet), a phablet, a mobile internet device (MID), a personal digital assistant (PDA), an enterprise digital assistant (EDA), a digital camera, a digital video camera, a portable game console, an MP3 player, a portable/personal multimedia player (PMP), a handheld e-book, a portable laptop PC, a global positioning system (GPS) navigation device, a personal navigation device or portable navigation device (PND), a handheld game console, an e-book, and devices such as a high definition television (HDTV), an optical disc player, a DVD player, a Blu-ray player, a set-top box, robot cleaners, a home appliance, content players, communication systems, image processing systems, graphics processing systems, other consumer electronics/information technology (CE/IT) devices, or any other device capable of wireless communication or network communication consistent with that disclosed herein. The digital devices may be implemented in a smart appliance, an electric vehicle, an intelligent vehicle, or in a smart home system.

The digital devices may also be implemented as a wearable device, which is worn on a body of a user. In one example, a wearable device may be self-mountable on the body of the user, such as, for example, a watch, a bracelet, or as an eye glass display (EGD), which includes one-eyed glass or two-eyed glasses. In another non-exhaustive example, the wearable device may be mounted on the body of the user through an attaching device, such as, for example, attaching a smart phone or a tablet to the arm of a user using an armband, incorporating the wearable device in clothing of the user, or hanging the wearable device around the neck of a user using a lanyard.

The foreground detecting apparatus 100, context information estimator 110, background model constructor 120, foreground detector 130, background motion estimator 111, foreground speed estimator 112, illumination change estimator 113, background model generator 121, background model updater 122, foreground probability map generator 131, and foreground extractor 132 described in FIGS. 1 and 2A-2C that perform the operations described in this application are implemented by hardware components configured to perform the operations described in this application that are performed by the hardware components. Examples of hardware components that may be used to perform the operations described in this application where appropriate include controllers, sensors, generators, drivers, memories, comparators, arithmetic logic units, adders, subtractors, multipliers, dividers, integrators, and any other electronic components configured to perform the operations described in this application. In other examples, one or more of the hardware components that perform the operations described in this application are implemented by computing hardware, for example, by one or more processors or computers. A processor or computer may be implemented by one or more processing elements, such as an array of logic gates, a controller and an arithmetic logic unit, a digital signal processor, a microcomputer, a programmable logic controller, a field-programmable gate array, a programmable logic array, a microprocessor, or any other device or combination of devices that is configured to respond to and execute instructions in a defined manner to achieve a desired result. In one example, a processor or computer includes, or is connected to, one or more memories storing instructions or software that are executed by the processor or computer. 
Hardware components implemented by a processor or computer may execute instructions or software, such as an operating system (OS) and one or more software applications that run on the OS, to perform the operations described in this application. The hardware components may also access, manipulate, process, create, and store data in response to execution of the instructions or software. For simplicity, the singular term “processor” or “computer” may be used in the description of the examples described in this application, but in other examples multiple processors or computers may be used, or a processor or computer may include multiple processing elements, or multiple types of processing elements, or both. For example, a single hardware component or two or more hardware components may be implemented by a single processor, or two or more processors, or a processor and a controller. One or more hardware components may be implemented by one or more processors, or a processor and a controller, and one or more other hardware components may be implemented by one or more other processors, or another processor and another controller. One or more processors, or a processor and a controller, may implement a single hardware component, or two or more hardware components. A hardware component may have any one or more of different processing configurations, examples of which include a single processor, independent processors, parallel processors, single-instruction single-data (SISD) multiprocessing, single-instruction multiple-data (SIMD) multiprocessing, multiple-instruction single-data (MISD) multiprocessing, and multiple-instruction multiple-data (MIMD) multiprocessing.

The methods illustrated in FIGS. 6 and 7 that perform the operations described in this application are performed by computing hardware, for example, by one or more processors or computers, implemented as described above executing instructions or software to perform the operations described in this application that are performed by the methods. For example, a single operation or two or more operations may be performed by a single processor, or two or more processors, or a processor and a controller. One or more operations may be performed by one or more processors, or a processor and a controller, and one or more other operations may be performed by one or more other processors, or another processor and another controller. One or more processors, or a processor and a controller, may perform a single operation, or two or more operations.

The instructions or software to control computing hardware, for example, one or more processors or computers, to implement the hardware components and perform the methods as described above, and any associated data, data files, and data structures, may be recorded, stored, or fixed in or on one or more non-transitory computer-readable storage media. Examples of a non-transitory computer-readable storage medium include read-only memory (ROM), random-access memory (RAM), flash memory, CD-ROMs, CD-Rs, CD+Rs, CD-RWs, CD+RWs, DVD-ROMs, DVD-Rs, DVD+Rs, DVD-RWs, DVD+RWs, DVD-RAMs, BD-ROMs, BD-Rs, BD-R LTHs, BD-REs, magnetic tapes, floppy disks, magneto-optical data storage devices, optical data storage devices, hard disks, solid-state disks, and any other device that is configured to store the instructions or software and any associated data, data files, and data structures in a non-transitory manner and provide the instructions or software and any associated data, data files, and data structures to one or more processors or computers so that the one or more processors or computers can execute the instructions. In one example, the instructions or software and any associated data, data files, and data structures are distributed over network-coupled computer systems so that the instructions and software and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by the one or more processors or computers.

While this disclosure includes specific examples, it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents. The examples described herein are to be considered in a descriptive sense only, and not for purposes of limitation. Descriptions of features or aspects in each example are to be considered as being applicable to similar features or aspects in other examples. Suitable results may be achieved if the described techniques are performed in a different order, and/or if components in a described system, architecture, device, or circuit are combined in a different manner, and/or replaced or supplemented by other components or their equivalents. Therefore, the scope of the disclosure is defined not by the detailed description, but by the claims and their equivalents, and all variations within the scope of the claims and their equivalents are to be construed as being included in the disclosure. 

What is claimed is:
 1. An apparatus for detecting a foreground in an image, the apparatus comprising: a memory configured to store instructions; and a processor configured to execute the instructions to: estimate context information, including a foreground speed, on a scene from an image frame of the image; construct a background model of the image frame using the estimated context information; and detect a foreground from the image frame based on the constructed background model, wherein the foreground speed is estimated by calculating a speed using an optical flow of the image frame and a previous image frame, and correcting the speed based on a projective transform matrix that represents background motion.
 2. The apparatus of claim 1, wherein the context information further includes an illumination change.
 3. The apparatus of claim 2, wherein the processor is further configured to execute the instructions to: generate the projective transform matrix using the speed; and estimate the illumination change based on a difference between a mean of brightness intensities of the image frame and a mean of brightness intensities of the background model.
 4. The apparatus of claim 3, wherein the previous image frame comprises an immediately previous image frame.
 5. The apparatus of claim 2, wherein for the construction of the background model, the processor is further configured to execute the instructions to generate the background model of the image frame using a background model of the previous image frame and the background motion.
 6. The apparatus of claim 5, wherein for the construction of the background model, the processor is further configured to execute the instructions to determine locations on the previous image frame based on a location of the previous image frame which corresponds to a first location of the image frame, and to generate a background model of the image frame for the first location by performing weighted summation on background models of the previous image frame for the determined locations.
 7. The apparatus of claim 6, wherein for the construction of the background model, the processor is further configured to execute the instructions to correct a variance of the generated background model, in response to a value of a context variable based on the foreground speed being greater than or equal to a block size.
 8. The apparatus of claim 7, wherein for the construction of the background model, the processor is further configured to execute the instructions to correct a variance of the background model of the first location by reflecting differences between a mean of the background model of the first location and means of the background models at the determined locations.
 9. The apparatus of claim 5, wherein for the construction of the background model, the processor is further configured to execute the instructions to update the generated background model based on the illumination change.
 10. The apparatus of claim 9, wherein for the updating of the generated background model, the processor is further configured to execute the instructions to update a mean of the generated background model by reflecting the illumination change, in response to value of a context variable being smaller than a block size, and to update the mean and a variance of the generated background model by reflecting the illumination change and additional information, in response to the value of the context variable being greater than or equal to the block size.
 11. The apparatus of claim 10, wherein the additional information comprises either one or both of a mean of block-specific brightness intensities and information on a difference between a mean of the generated background model and the image frame.
 12. The apparatus of claim 10, wherein for the updating of the generated background model, the processor is further configured to execute the instructions to calculate a value of a time-variant variable in units of blocks starting from an initial image frame input from the camera to the image frame, and to adjust an update intensity of the generated background model using the value of the time-variant variable.
 13. The apparatus of claim 1, wherein for the updating of the generated background model, the processor is further configured to execute the instructions to generate a foreground probability map by calculating a foreground probability for each pixel of the image frame based on the constructed background model, and extract the foreground from the image frame using the generated foreground probability map.
 14. The apparatus of claim 13, wherein for the extracting of the foreground, the processor is further configured to execute the instructions to classify each pixel of the image frame into any one or any combination of any two or more of the foreground, a candidate, and a background based on a comparison of the foreground probability of each pixel of the image frame with a threshold value, and to categorize pixels classified into the candidate into the foreground or the background based on applying watershed segmentation to the pixels classified into the candidate.
 15. A method of detecting a foreground in an image, the method comprising: estimating context information, including a foreground speed, on a scene from an image frame of the image; constructing a background model of the image frame using the estimated context information; correcting a variance of the constructed background model, dependent on whether a value of a context variable based on the foreground speed is greater than or equal to a block size; and detecting a foreground from the image frame based on the variance-corrected constructed background model.
 16. The method of claim 15, wherein the context information further comprises an illumination change.
 17. The method of claim 16, wherein the constructing of the background model comprises generating the background model of the image frame using a background model of a previous image frame based on information on a background motion.
 18. The method of claim 17, wherein the constructing of the background model further comprises updating the constructed background model based on the illumination change.
 19. The method of claim 18, wherein the updating of the background model further comprises updating a mean of the constructed background model by reflecting the illumination change, in response to a value of the context variable being smaller than the block size, and updating a mean and the variance of the constructed background model by reflecting the illumination change and additional information, in response to the value of the context variable being greater than or equal to the block size.
 20. The method of claim 15, wherein the detecting of the foreground comprises generating a foreground probability map by calculating a foreground probability for each pixel of the image frame based on the constructed background model, and extracting the foreground from the image frame using the generated foreground probability map.
 21. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 15.
 22. A digital device, comprising: an antenna; a cellular radio configured to transmit and receive data via the antenna according to a cellular communications standard; a touch-sensitive display; a camera configured to capture an image; a memory configured to store instructions; and a processor configured to execute the instructions to receive the image from the camera, estimate context information, including an illumination change, on a scene from an image frame of the image, construct a background model of the image frame using the estimated context information, update a mean of the constructed background model by reflecting the illumination change, dependent on whether a value of a context variable is smaller than a block size, and detect a foreground from the image frame based on the constructed background model.
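As a non-limiting illustration only, the conditional model update recited in claim 19 and the probability-map classification recited in claims 13 and 14 may be sketched as follows. The sketch assumes a per-pixel Gaussian background model (a mean and a variance per pixel). The probability formula, the threshold values `t_low` and `t_high`, the parameter `extra_var` standing in for the "additional information", and all function names are hypothetical and are not taken from the claims. The watershed segmentation that resolves candidate pixels is omitted.

```python
import numpy as np

BACKGROUND, CANDIDATE, FOREGROUND = 0, 1, 2


def update_model(mean, var, illumination_change, context_value, block_size,
                 extra_var=0.0):
    """Update the model depending on the context variable (cf. claim 19).

    The mean always reflects the illumination change. The variance is only
    touched when the context variable is >= the block size; `extra_var` is a
    hypothetical stand-in for the unspecified "additional information".
    """
    new_mean = mean + illumination_change
    if context_value < block_size:
        return new_mean, var
    return new_mean, var + extra_var


def foreground_probability(frame, mean, var):
    """Per-pixel foreground probability map (cf. claim 13).

    Assumed form: 1 - exp(-d^2 / 2), where d^2 is the squared deviation from
    the background mean normalized by the background variance.
    """
    d2 = (frame - mean) ** 2 / np.maximum(var, 1e-6)
    return 1.0 - np.exp(-0.5 * d2)


def classify(prob, t_low=0.3, t_high=0.9):
    """Three-way labelling by threshold comparison (cf. claim 14).

    Pixels between the two assumed thresholds stay CANDIDATE; a later
    segmentation step (not shown) would assign them to foreground or
    background.
    """
    labels = np.full(prob.shape, CANDIDATE, dtype=np.uint8)
    labels[prob < t_low] = BACKGROUND
    labels[prob >= t_high] = FOREGROUND
    return labels


frame = np.array([[10.0, 10.0], [12.0, 30.0]])
mean = np.full((2, 2), 10.0)
var = np.full((2, 2), 4.0)

prob = foreground_probability(frame, mean, var)
labels = classify(prob)
```

In this example the pixel that deviates slightly from the background mean remains a candidate, while the strongly deviating pixel is labelled foreground directly. Using two thresholds rather than one defers only the ambiguous pixels to the more expensive segmentation step.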